Efficient Robust Proper Learning of Log-concave Distributions

Authors

  • Ilias Diakonikolas
  • Daniel M. Kane
  • Alistair Stewart
Abstract

We study the robust proper learning of univariate log-concave distributions (over continuous and discrete domains). Given a set of samples drawn from an unknown target distribution, we want to compute a log-concave hypothesis distribution that is as close as possible to the target in total variation distance. In this work, we give the first computationally efficient algorithm for this learning problem. Our algorithm achieves the information-theoretically optimal sample size (up to a constant factor), runs in polynomial time, and is robust to model misspecification with nearly optimal error guarantees. Specifically, we give an algorithm that, on input $n = O(1/\epsilon^{5/2})$ samples from an unknown distribution $f$, runs in time $\tilde{O}(n^{8/5})$ and outputs a log-concave hypothesis $h$ that (with high probability) satisfies $d_{\mathrm{TV}}(h, f) = O(\mathrm{OPT}) + \epsilon$, where $\mathrm{OPT}$ is the minimum total variation distance between $f$ and the class of log-concave distributions. Our approach to the robust proper learning problem is quite flexible and may be applicable to many other univariate distribution families.

  • Part of this work was performed while the author was at the University of Edinburgh. Supported in part by EPSRC grant EP/L021749/1 and a Marie Curie Career Integration grant.
  • Supported in part by NSF Award CCF-1553288 (CAREER). Some of this work was performed while visiting the University of Edinburgh.
  • Part of this work was performed while the author was at the University of Edinburgh. Supported by EPSRC grant EP/L021749/1.
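The two notions the abstract relies on, total variation distance and log-concavity, can be made concrete for the discrete case. The following is a minimal sketch and is not the authors' algorithm: the function names `tv_distance` and `is_log_concave` are hypothetical, and it assumes distributions given as finite probability vectors on consecutive integers, using the standard definitions (half the L1 distance; contiguous support with p[i]^2 >= p[i-1] * p[i+1]).

```python
def tv_distance(p, q):
    """Total variation distance between two pmfs on the same finite domain.

    For discrete distributions this equals half the L1 distance.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same domain")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def is_log_concave(p, tol=1e-12):
    """Check discrete log-concavity of a pmf on consecutive integers.

    Requires a contiguous support and p[i]^2 >= p[i-1] * p[i+1] inside it.
    The tolerance `tol` is an illustrative choice to absorb rounding error.
    """
    support = [i for i, x in enumerate(p) if x > 0]
    if not support:
        return False
    lo, hi = support[0], support[-1]
    # A gap inside the support violates log-concavity.
    if any(p[i] <= 0 for i in range(lo, hi + 1)):
        return False
    return all(p[i] ** 2 + tol >= p[i - 1] * p[i + 1]
               for i in range(lo + 1, hi))


# Binomial(4, 1/2) is log-concave; a two-peaked pmf is not.
binom = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
uniform = [0.2] * 5
bimodal = [0.4, 0.1, 0.1, 0.4]

print(is_log_concave(binom), is_log_concave(bimodal))
print(tv_distance(binom, uniform))
```

In the abstract's notation, OPT would be the smallest value of such a distance between the target f and any log-concave distribution; the sketch only evaluates the distance for a given pair and does not search over hypotheses.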

Related resources

Sample and Computationally Efficient Learning Algorithms under S-Concave Distributions

We provide new results for noise-tolerant and sample-efficient learning algorithms under s-concave distributions. The new class of s-concave distributions is a broad and natural generalization of log-concavity, and includes many important additional distributions, e.g., the Pareto distribution and t-distribution. This class has been studied in the context of efficient sampling, i...

Active and passive learning of linear separators under log-concave distributions

We provide new results concerning label efficient, polynomial time, passive and active learning of linear separators. We prove that active learning provides an exponential improvement over PAC (passive) learning of homogeneous linear separators under nearly log-concave distributions. Building on this, we provide a computationally efficient PAC algorithm with optimal (up to a constant factor) sa...

Learning mixtures of structured distributions over discrete domains

Let C be a class of probability distributions over the discrete domain [n] = {1, . . . , n}. We show that if C satisfies a rather general condition – essentially, that each distribution in C can be well-approximated by a variable-width histogram with few bins – then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of k unknow...

Learning Halfspaces with Malicious Noise

We give new algorithms for learning halfspaces in the challenging malicious noise model, where an adversary may corrupt both the labels and the underlying distribution of examples. Our algorithms can tolerate malicious noise rates exponentially larger than previous work in terms of the dependence on the dimension n, and succeed for the fairly broad class of all isotropic log-concave distributio...

A smooth ROC curve estimator based on log-concave density estimates.

We introduce a new smooth estimator of the ROC curve based on log-concave density estimates of the constituent distributions. We show that our estimate is asymptotically equivalent to the empirical ROC curve if the underlying densities are in fact log-concave. In addition, we empirically show that our proposed estimator exhibits an efficiency gain for finite sample sizes with respect to the sta...

Journal:
  • CoRR

Volume: abs/1606.03077

Publication date: 2016